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Abstract 

The motivation of this research came about when a neural network direct adaptive control 
schemes was applied to control the tip position of a flexible robotic arm. Satisfactory control 
performance was not attainable due to the inherent non-minimum phase characteristics of the 
flexible robotic arm tip. Most of the existing neural network control algorithms are based on the 
direct method and exhibit very high sensitivity if not unstable closed-loop behavior. Therefore a 
neural self-tuning control (NSTC) algorithm is developed and applied to this problem and 
showed promising results. Simulation results of the NSTC scheme and the conventional self- 
tuning (STR) control scheme are used to examine performance factors such as control tracking 
mean square error, estimation mean square error, transient response, and steady state response. 


1. Introduction 

Self-tuning adaptive control used for controlling unknown ARMA plants has traditionally 
been based on the minimum variance control law and a recursive identification algorithm (Astrom 
and Wittenmark, 1973; Clark and Gawthrop, 1979). Although the advancement in VLSI has 
made it more possible to implement real-time recursive algorithms but it is still computationally 
intensive and expensive due to the recursive nature of the algorithm. On the other hand, neural 
networks VLSI has been made available commercially with extreme processing capability due to 
its parallel architecture. With this in mind the possibility of formulating neural networks to 
perform functions of conventional recursive algorithms becomes important. Hence, in this paper 
we propose the neural self tuning control (NSTC) scheme where the implicit identification is 
performed by a multilayer neural network (MNN) and the control is based on the generalized 
minimum variance (GMV) control law. 

Neural networks have undoubtedly demonstrated its effectiveness in controlling nonlinear 
systems with known/unknown dynamics and uncertainties (Narendra and Parthasrathy, 1990; 
Levin and Narendra, 1993; Werbos et al. 1990; Hunt et al., 1992). In addition, neural network 
adaptive control algorithms have also been developed for specific linear system model such as the 
state space model (Ho et al., 91a) and the ARMA model (Ho et al., 1991b). It was shown in the 
simulation results that neural network controllers produced comparable results to conventional 
adaptive controllers. In this paper, we investigate the performance of the NSTC and compare it to 
the conventional adaptive STR. 


The flexible arm to be controlled is shown in Figure 1.1. There are two system outputs 
that are of interest, one is the hub angle 9h(t) and the other is the tip angle 0t(t) of the arm. The 
goal is to apply a neural network control scheme to control these outputs to track the command 
signals. The neural controller will generate a control voltage signal u(t) that will feed the power 
amplifier in which will force current through the motor and cause the arm position to react. The 
dynamical transfer function of the hub angle is a linear minimum phase system in which will be 
shown readily controllable by a neural network. In fact, the direct adaptive neural control scheme 
in Figure 1.2. can be used to control the hub. This control scheme belongs to the type called 
specialized learning control (Psaltis et al., 1988; Ho et al., 1991c). However, the tip of the arm, 
being in a different location from the actuator point, therefore making the system to be of the type 
non-collocated system. The effect of this dynamically is that there is a zero on the right half side of 
the s-plane. In other words, the transfer function of the tip angle is of the type non-minimum 
phase which presents itself to be very difficult to control when direct adaptive control 
methodology is applied. This difficulty may be due to the controller trying to emulate the inverse 
dynamics of the non-minimun plant and results in an unstable behavior. According to simulation 
studies the specialized learning control algorithm diverges when applied to control the tip angle. 
Most other neural control schemes are also based on the inverse dynamics including the indirect 
learning method by (Psaltis et al., 1988), the feedback error learning by (Kawato et al., 1988), and 
the methods presented by (Narendra and Parthasarathy, 1990). 



Figure 1.1. Flexible arm system 





Figure 1.2. specialized learning control of hub velocity 



Figure 1.3 Indirect neural adaptive control scheme 

In this report, we propose to use the neural self tuning control scheme which is based on an 
indirect control method (Ho et al., 1991c) to control the tip angle. This scheme is shown in Figure 
1.3 where the identification is performed by the MNN and the control is performed by the 
generalized minimum variance (GMV) controller. The GMV control algorithm has a dynamic 






weighting function Q(q 4 ) applied to the plant control signal u(k) in the cost function to limit and 
condition the control energy. Thus, upon selecting the proper weighting function the controller can 
be input/output stable and effective in controlling the non-minimum phase plant. In section 2 the 
neural self-tuning control (NSTC) which consists of the minimum variance control algorithm and 
the neural identification is presented. Section 3 covers the basic dynamics of the flexible arm tip 
position. Section 4 presents a comparative simulation study of the adaptive STR scheme and the 
NSTC scheme. And section 5 gives the conclusion of the results found in this study and address 
the advantages and disadvantages of the neural control scheme used for treating linear system. 


2. Stochastic neural self-tuning adaptive control (NSTC) 

The NSTC consists of the minimum variance control law and the neural identification 
algorithm. The model assumed for the plant is of ARMA input/output type having the form 


y(k) = q' d u(k) + £(k) (2. 1) 

A(q l ) A(q ) 

where u(k), y(k), £(k), and d are system input, output, uncertainty, and delay, respectively. A, B, 
and C are unknown system dynamics defined as 

A(q _1 ) = 1 + ajq' 1 + a 2 q‘ 2 + ... + a na q' na (2-2) 

B(q -1 ) = b 0 + bjq* 1 + b 2 q’ 2 + ... + b nb q' nb (2.3) 

C(q' 1 ) = 1 + Cjq’ 1 + c 2 q’ 2 + ... + c nc q nc (2.4) 


where q is the shift operator. For the above unknown plant, in Figure 1.3, the objective is to 
control its output to track a command signal y*(k) based on the generalized minimum variance 
control index (Clark and Gawthrop, 1979) 

J(k+d)=E{<j> 2 (k+d)} 

= E { [P(q- 1 )y(k+d)+Q(q- 1 )u(k)-R(q' 1 )y*(k)] 2 } 

= E { [4> y (k+d)+Q(q" 1 )u(k)-R(q’ ! )y*(k)] 2 } 


(2.5) 


where E is the expectation operator, $ (k+d) is the auxiliary output, and P,Q, and R are the 
weighting dynamics which can be chosen depending on the required response characteristics. 


2.1. Generalized minimum variance control 

In this section, the generalized minimum variance self-tuning control algorithm for the 
above problem statement is summarized (Clark and Gawthrop, 1979). To obtain the optimal 
control u(k) which minimizes the performance index (2.5), the predictive auxiliary output <J» y (k-Kl) 

in terms of the system dynamics must be determined. Consider the following identity 
A(q-‘) W q A(q->) 

where the order of F(q-l) and G(q- 1 ) are nf=d-l, ng=na-l, respectively. The output prediction can 
be shown to have the form 

(|> y (k-Kl) = $ y (k+d) + £ y (k+d) (2- 1 .2) 

where 

$ y (k+d) = C(q- 1 )* 1 [G(q- 1 )y(k) + F(q‘ 1 )B(q- 1 )u(k)] 

= C(q- 1 )' 1 [G(q* 1 )y(k) + E(q- 1 )u(k)] (2.1.3) 


and 


<j> y (k+d) = F(q‘ l )£(k+d) (2-1-4) 

$ y (k+d) and ^ y (k+d) are the deterministic and uncorrelated random components of <j> y (k+d). 
Next, substituting (2.1.2) into (2.5), there results 

J(k+d) = E{[$ y (k+d)+Q(q' 1 )u(k)-R(q' 1 )y*(k)] 2 } +E [^y( k+d )] 2 l (2.1.5) 

Since the second term in (2.1.5) is unpredictable random noise which is uncompensatable by the 
control input u(k), and the first term is a linear function of u(k), J(k+d) can be minimized by 
setting 


( 2 . 1 . 6 ) 


[$ y Ck+d)-»-Q(q- ^uC^-RCq" by^Ck)] = 0 

Solving for the generalized minimum variance (GMVC) control in (2.1.6) gives 


u(k) = 


R(q~ 1 )y*O c H v ( k + d ) 
~ Q(q* 1 ) 


(2.1.7) 


using (2.1.3), (2.1.7) can also be written as 

■■m _ C(q~ ^RCcT 1 )y*(k)-G(q~ 1 )y(kl (2.1.8) 

( E(q‘ 1 )+C(q’ 1 )Q(q" 1 ) 

Remarks . Recall that E(q* 1 ) is equal to F(q _1 )B(q* 1 ) where B(q**) contains the zeros of the 
plant. Notice that having the weighting function Q(q _1 ) additive to E(q- [ ) in (2.1.8) gives the 
designer the ability to alter the poles of the controller. Thus with a non-minimum phase plant 
B(q !) shall have unstable roots and proper selection of Q(q _1 ) in (2.1.8) can assure the control 
signal u(k) to be bounded. 


2.2. Neural system identification 

In this section, a stochastic neural identification algorithm is developed for the self- 
tuning control scheme in Figure 1.3. Recall the predicted auxiliary output in (2.1.3) which can 
also be written as 

<J>y(k+d) = C(q- 1 )‘ 1 [G(q 4 )y(k) + E(q‘ 1 )u(k)] + F(q' ! )^(k+d) 

= C(q* 1 )" 1 [G(q‘ 1 )y(k) + E(q- 1 )u(k)] + v(k) (2.2. 1) 

where the uncorrelated noise sequence F(q- 1 )^(k+d) is replaced by v(k). Also (2.2.1) can be 
written as 


ng ne 

<t> (k+d) = ^gjy(k-i) + ^e-u(k-i) - / c ; <t>y(k-Kl-i) + v(k) (2.2.2) 

i=0 i=0 i=l 


<}> y (k-Kl) = \|/'(k)0(k)+ v(k) 


(2.2.3) 
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where 


\|/'(k) = [y(k)...y(k-ng); u(k)...u(k-ne); <|>y(k4d-l)...<|>y(k+d-nc)] (2.2.4) 


0 ’(k) = [g 0 gj — g ng > e o e l - e ne ; ' C 1 ’ c 2 *" ' c nc^ 


(2.2.5) 


since the parameter vector 0 is unknown, the estimated form of <t> y (k+d) is given as 

$ y (k+d) = \jf’0O§(k) (2.2.6) 

where 

V'(k) = [y(k)...y(k-ng); u(k)...u(k-ne); $ y (k+d-l)...$(k+d-nc)] (2.2.7) 


9 , (k) = [g 0 g 1 - gng-’ e O e l - 


A A A A 

e ne ; ' C 1 ‘ c 2 *•* ^nc* 


( 2 . 2 . 8 ) 


The unknown parameter vector in (2.2.8) (Figure 2.1), is taken from the output of the neural 
network 

fl(k) - [ &l(k) &2(k> ... &j(k) ... &„3(k)]' 

= [Ol(k) 02(k) - Oj(k) ... On3(k)]' (2.2.9) 


Where n3 is the number of neurons at the output layer. Consider the system identification cost 
function 


V(k) =^E{e'(k)A~l(k)E(k)} 


= i E{ [4>y(k)-$ y (k)]’A- 1 (k)[<J>y(k)-$y(k)] } (2.2.10) 

where A(k) is a symmetric positive definite weighting matrix, and V(k) is minimized by adjusting 
the weights of the neural identifier. 
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NEURAL 

NETWORK 


Figure 2.1. Neural network structure 

In Figure 2.1, the weights connecting the second layer to the output layer, using the gradient 
search (Rumelhart and McClelland, 1987), can be updated as 

coij(k+l) = coij(k) + Acoij(k) (2.2.1 1) 


where 

AcoijCk) = [\ e'(k)A-l(k)e(k)} 

J acoij(k) L 

= -Tl ^ {^[4» y (k)-$y(k)]'A-l(k)[<|»y(k)-$y(k)] 

9coij(k) L 

= T] ^ ,(k) ^y' ( -^A-l(k)[({)y(k)-$y(k)] (2.2.12) 

3toij(k) 50(k) 

with T| being the search step size. Consider the derivative of y(k) with respect to ^(k) in (2.2.12) 


8&(k) 


= Y(k-d) 


(2.2.13) 


In (2.2.13) we have assumed that 6(k) ~ 0(k-d), that is, 6 is slowly time varying with respect to 
the delay time d. The other partial derivative in (2.2.12) can be determined as 


dfr(k) 

3o)ij(k) 


0 . (k y. 3[f(Netj(k))r 

~ °A k > e J 3Netj(k) 


(2.2.14) 


where f(.) is the sigmoidal activation function, Oi(k) is the output of the second layer, and 


Netj(k) = [netl net2 ... netj ... net n 3l' 


(2.2.15) 
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with 


n2 

netj(k) = Xcoij( k )Oi( k ) 

where n2 is the number of neurons of the second hidden layer as shown in Figure 2.1. Also ej in 
(2.2.14) is defined as 

ej = [0...0 1 0...0] (2.2.16) 

with the j-th element in ej being 1, and other elements are 0. Thus, substituting (2.2.14) back into 
(2.2.12) gives 

Acoij(k) = riej5j(k)Oi(k) ( 2 - 2 - 17 ) 


where 

< 2 ' 218 > 

Next, the weights connecting the first to the second layer, in Figure 2.1, can be updated by the 
recursive equation 

(Ori(k+l) * a>ri(k) + Ao)ri(k) (2.2.19) 


where 

Acori(k) = -T) ~ 4e'(k)A-l(k)e(k)} ( 2 - 2 - 2 «) 

9tOri(k) z 

using the similar back propagation approach, (2.2.20) can be shown to result in the following form 
Acori(k) = q5i(k)0r(k) (2.2.21) 


where Or is the output of the first layer and 


_ 3f[neti(k)] _ 

8i(k)=[o)ii.„coij...a)in3] 9n e ti(k) 


( 2 . 2 . 22 ) 


Lastly, the weights connecting the input to the first layer, in Figure 2.1, can be updated by 
the recursive equation 
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cosr(k +1 ) = ®sr(k) + Aco sr (k) (2.2.23) 

where 

AcosKk) = ■'Hr ~ {A e'(k)A'l(k)e(k)) (2.2.24) 

3o) sr (k) L 


Again, using the back propagation approach, (2.2.24) can be determined as 


Aco S r(k) = n5r(k)I s (k) 


(2.2.25) 


where I s (k) is the input from the delay network and 


5f(k) = [corl...cori...com2] 


9f(net r (k)) 3[f(Netj(k))]' 3Netj(k)' 
3netr(k) 3Neti(k) 3[f(Neti(k))]’ °J w 


(2.2.26) 


with Neti(k) being defined similarly as Netj(k) in (2.2.14). By adjusting the weights (Oij(k), 
cori(k), and co S r(k) with the above algorithm, the unknown implicit plant's parameters can be 
identified and obtained at the output of the neural identifier, as shown in Figure 2.1. Once the 
estimate of 0 is available, $ y (k+d) in (2.2.6) can be computed, and then the control signal can be 

generated using (2.1.7) as 


u(k) = 


R(q~ 1 )y*(k)~$ v (k+d) 

Q(q 3 4 ) 


(2.2.27) 


3. Flexible arm tip position dynamics 

This section describes the components and the control model of the flexible arm tip. A 
detailed discussion of the dynamics of flexible arm tip and hub can be found in (Fraser and 
Daniel, 1991). In order to control the flexible robotic arm shown in Figure 1.1, it is required that 
the control action produced by the control program running on a processor board is converted to a 
voltage by the D/A board and forms the input to the power amplifier of the motor. The output of 
the power amplifier is a motor current directly proportional to the input voltage. The motor then 
converts this current to a torque to drive the arm. The resulting motion of the arm is detected by 
the various sensors and fed back to the controller. 


The adaptive control algorithm design does not require the complete knowledge of the plant 
dynamics. However, for the purpose of simulation study, the transfer function model of the plant 
needs to be known. This model must incorporate not only the behavior of the flexible arm itself 
but also the power amplifier, the motor and the output sensors. In a servo system, the power 
amplifier and the sensors usually have a much higher bandwidth than that of the motor and load 
therefore they can be approximated as a constant The general transfer function of the flexible arm 

tipis 

e t (s) _ k a k t t t (i-s 2 /q?i) 

u(s) S(s+c 0 )f = | (i+2Cis/o)i+s 2 /tof) (3.1) 

where the physical interpretation of the above equation is as follows: 

First, poles and zeros of the system is depicted in Figure 3.1 below 


s-plane 



Figure 3.1 Pole-zero diagram of flexible arm tip 
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The above diagram shows the three constituting dynamic components of the plant which are the 
motor, the resonant modes of the flexible arm, and the arm non-minimum phase characteristics. 
The dynamics of the servo motor system is represented by the term 


K A K t 

(S+C 0 )s 


(3.2) 


where Kj. is the motor torque constant, K A represents the power amplifier and sensor gain, and Co 

represents the back emf and viscous damping effects know as the mechanical time constant. The 
motor can be seen as a series of subcomponent connected in series as shown in Figure 3.2. 


motor motor 

velocity position 



Figure 3.2 Servo motor system components 
Next, the the flexible arm attached to the motor shaft is describe by the term. 

j-r (l-s 2 /a?i) 

1= T (l+2Cis/a)i+s 2 /to?) (3.3) 

Here, the denominator of (3.3) represents the set of flexible resonant modes of the arm. Each 
flexible mode is associated with the corresponding damping £i at a frequency G*. In theory, there 
is an infinite number of flexible modes, but in practice only the sufficiently low frequency modes 
will be noticeable by the control system. This is because a real system is always band-limited 
therefore most of the modes are attenuated by the low-pass frequency behavior. Also, the 
frequency range of operation can be limited to be below the major dominant resonant mode so that 
oscillation will not be present in the system response. If higher frequency range of operation is 
desired, the dominate resonant modes can be notch filtered out provided their damping Ci’ s and 
frequencies COi's are determinable. 

Consider the physical properties of the flexible arm and the servo system given in Table 
3.1. Based on these parameters the transfer function was derived and measured by experiment 
(Fraser and Daniel, 1991). Both results agreed as shown in Table 3.2. The five resonant modes 
occupy the frequency range from 86 rad/sec to 1445 rad/sec. The frequency response of this 
system was simulated and is shown in Figure 3.2. The peaks represent the resonant energy at the 
specific frequencies. Also notice that the energy of the modes lessens are the frequency increases. 



Physical properties of arm and motor 


effective beam length (m) 


0.386 


beam thickness (mm) 


0.956 


beam width 


0.03 



total Inertia (kg m A 2 


0.000041 
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For easy controllability it is desirable to filter out these resonance modes. Therefore, a notch filter 
is designed to notch out the first resonance mode and a low pass filter is used to filter out the rest 
of the resonance mode. Figure 3.3 shows a block diagram of the filtering process. The resulting 
frequency ideal response is shown in Figure 3.4. 


notch filter 


flexible arm 

LP filter system 



Figure 3.3 . Frequency response of open-loop components 



Figure 3.4 Frequency response of the aggregate filtered open-loop 

Since we are primarily interested in learning the controllability and behavior of the non-minimum 
phase characteristics of the plant, we can simplify the arm tip transfer function to have the form 
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6 t (s) ^ K A K T (l-s 2 /a 2 ) 

u(s) s(s+co) (3.4) 

Lastly, the non-minimum characteristics of the arm tip is describe in (3.1) and (3.4) by the 
numerator term. 

(l-s 2 /a 2 ) 

This is due to the fact that the control system sensing and actuation do not take place at the same 
location and therefore being a non-collocatted system. It should be mentioned that the non- 
minimum phase characteristics is very difficult for the neural network to control (since most neural 
network adaptive control schemes are based on the direct method). 


4. Empirical studies 

In this section we examine some simulation results of the direct and indirect neural control 
schemes for controlling the flexible arm hub and tip. We will show that the hub having a well 
behaved linear transfer function produced very satisfactory controlled response. We also attempted 
to use the direct adaptive control scheme to control the tip velocity and found unstable response 
even after numerous controller parameter changes. Next, the NSTC scheme in section 2 was 
applied to control the tip position and produced encouraging results. Lastly, the neural identifier in 
the NSTC algorithm is compared with the recursive least square identifier and show faster 
convergent rate. 


4.1. Neural direct adaptive control of arm hub and tip 


The neural direct adaptive control scheme was first introduced by (Psaltis et al., 1988) and 
was later reformulated for nonlinear/linear state space system by (Ho et al., 1991c). We will apply 
this scheme, shown in Figure 4.1. to control the hub velocity of the arm. 


The dynamic transfer function of the hub is a linear minimum phase system. The numerical 
transfer function found in (Fraser and Daniel, 1991) is 


6 h (s) _ 

U(s) 


10.2 (1+- ^r) 

32.7 2 


(s+0.57)(s+2000) 


(4.1) 
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Figure 4. 1 . Neural direct control scheme of hub velocity 

where the resonant modes are assumed to be filtered out. In the simulation process , the model in 
(4.1) was first discretized and then converted to state space form 

x(k+l) = Ax(k) + Bu(k) (4.2) 

0 h (s) = Cx(k) 

When using this scheme (Figure 4.1.) there is a priori information that is needed and that is the 

ae h (k) 

jacobian of the plant du(k) This term was computed based on the discretized model and resulted 
as 

30h(k) _ 

3u(k) (4.3) 

Information on the neural network algorithm is refered to (Ho et al., 1991c). 

Remarks! '■ The hub position was not suitable for this specialized learning control scheme because 
the jacobian turns out to be near zero. Therefore the velocity is the selected controlled variable and 
an additional outer control loop may be incorporated to achieve position control. This outer loop 
will have a velocity profile generator which resembles to a proportional controller with saturation 
(Franklin and Powell, 1981). 

Simulation: A smoothed square wave command was presented to the control system, after 50 
iterations (about .3 seconds, sampling period was 6 ms) the hub had tracked the command signal 
as shown in Figure 4.2 where the solid line is the desired response and the dashed line is the actual 


response. 
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Figure 4.2. Hub velocity response: 9t00&®t(k) 


This trackability is reflected in the mean square tracking error shown in Figure 4.3. Notice that the 
convergent time in control application is serveral orders of magnitude faster than other 
applications. In this case it took only 50 iterations for the 2-layer neural network to be maturely 
trained with initial random weights. This fast convergent time makes it very practical for real-time 
control implementation. 



Iterations 


Figure 4.3. Control tracking MSE response 

Next, the same scheme is applied to control the tip velocity. The numerical tip transfer function 
(based on the flexible arm and motor properties in Tables 3.1 and 3.2) is given in (Fraser and 
Daniel, 1991) as 


e t (s) 

U(s) 


3.6 (1+-Si-) 
48. 4 2 
s(s+0.16) 


(4.4) 


Here again, we are primarily interested in the non-minimum phase characteristics and therefore 
assumed that the resonant modes are filtered out 
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Simulation: After numerous attempts to vary the neural network parameters, an unstable closed- 
loop response was prevalent as shown in Figures 4.4 and 4.5. This is due to the fact that the 
neural network in Figure 4.1. trying to emulate the inverse dynamics of the plant (4.4.) and in 
effect produced an unstable pole behavior. Note in Figure 4.4. that the command signal is small 
compared to the plant diverging output response therefore it looks like a straight line. 



Iterations 


Figure 4.4. Unstable response of tip control: 0t*(k) and 0t(k) 



Figure 4.5. Diverging tracking MSE of tip velocity 

Neural Network: The N 5,10,1 neural network used in this scheme consists of one input layer, one 
hidden layer, and one output layer with the number of neurons as 5, 10, and 1, respectively. Also 
at the input of the neural was the desired response vector [y*(k) y*(k-l) y*(k-2) y*(k-3) y*(k- 
4)]T. The parameters of the sigmoidal activation function at the output node was found to be most 
influential on the tracking error convergent rate. Predominantly the slope of the activation function 
was observed to be proportional to the convergent rate. Also the bipolar sigmoidal saturation 
levels of the output neuron needed to be set equal to or greater than the maximum allowable plant 
input. The tuning of the sigmoidal functions was done manually by trial and error, typically for 
linear system like that of the hub, it takes very few tweaks (around 1 or 2) before the tracking 
results was achieved. Auto-tuning of the sigmoidal function parameters can also be applied to 
obtain statiscally better results (Yamada and Yabuta, 1992; Proano, 1989). 
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4.2. Neural self-tuning adaptive control (NSTC) of tip position 

In section 4.1. we showed by simulation that the direct neural adaptive control scheme was 
unable to control the tip position (Figures 4.4 and 4.5). In fact, this was why the NSTC algorithm 
was developed. Recall that this scheme has two distinct functions, identification and control, 
which are done by the neural network and the (GMV) control, respectively. The NSTC scheme is 
shown again in Figure 4.6. 



Figure 4.6. NSTC scheme block diagram 

In this section we perform the simulations of two schemes which are: The adaptive STR 
using recursive least square identification, and the NSTC using the neural identification. This is so 
that a comparative study can be done to assess the performance of the developed NSTC. 



Simulation: The model of the tip position is the discretized model of (4.4). Recall the control 
index defined in section 2 


23 


J(k+d) = E { <t> 2 (k+d) } 

= E{rP(q’ 1 )y(k+d)+Q(q' 1 )u(k)-R(q" 1 )y*(k)] 2 } (4.5) 

where the weighting functions were chosen as 

P(q 1 )=l; Q(q-l)=.l+.06q- 1 ; R(q' 1 )=l (4.6) 

* 

and the desired hub position 0 t (k) wa s a step command. Beginning with Figure 4.7. showing the 
desired step tip response, the controlled tip response based on the adaptive STR and the tip 
response from the NSTC. Obviously both controllers manage to track the command signal. 
However, the NSTC seems to have a slower settling time. Figure 4.8. shows the converging 
tracking control index (2.1.5) where both schemes seem very comparable to each other. Figure 
4.9. displays the comparable control energy produced by these controllers. Note that the transient 
control energy was affected by two factors: one is the initial condition of the estimated parameter 

vector (which was set as®o = [l 1 ... 1]' for both control schemes), the further ®o is away 

from the optimum 0 in the parameter state space, the longer the convergence of the tracking 
control index (2.1.5). The other factor is the selection of the input weighting function Q(q _1 ) 
which has the effect of limiting the control energy with the tradeoff of slower tracking 
convergence. Lastly, we compare the recursive least square identification with the neural network 
identification. The two identifiers estimate the parameter vector 0 in (2.2.5) so that the predictive 

output term <l>y( k + d ) in (2.2.2) can be computed. Figure 4.10. shows the estimation cost function 
V(k) in (2.2.10) response of the RLS and the neural network. V(k) of the RLS has a slightly 
faster convergence than the neural network but not by a significant degree. Again, this indicates 
that the identification performance of the two algorithms are comparable to each other. For 

completeness, the time response of the true output 0t(k) and the estimated output ®t(^) produced 
by the neural network is shown in Figure 4.1 1. 

Neural Network: The three layer neural network N 2,5,15,po use d in this scheme consists of one 
input layer, two hidden layers, and one output layer with the number of neurons as 2, 5, 15, and 
P0, respectively. P0 is the length of the vector defined in (2.2.8) which is (ng+l)+(ne+l)+nc, and 
is 11 for the case of the arm tip plant. The input of the neural network was a selected as constant 
vector Is = [1 1]' because it was desired that the output of the neural network to be correlated to the 
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its input The parameters of the sigmoidal activation function at the output node was found to be 
most influential on the tracking error convergent rate. Predominantly the slope of the activation 
function was observed to be proportional to the estimation convergent rate V(k). Also the bipolar 
sigmoidal saturation levels of the output neuron needed to be set equal to or greater than the 
maximum component of the parameter vector 0. The tuning of the sigmoidal functions was done 
manually by trial and error. Autotuning of the sigmoidal function parameters can also be applied to 
obtain statiscally better results (Yamada and Yabuta, 1992; Proano, 1989). However, the optimal 
dimension of the neural network in terms of number of layers and nodes was not known and 

therefore an initial pick of ^2,5,15,po was used throughout the simulation. 









Time (sec) 

Figure 4.10. Identification cost index V(k) of the adaptive STR and the NSTC 



Time (sec) 

Figure 4.1 1. True and neural network estimated tip position: ®t(k) & BtOO 
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5. Conclusion 

The neural self-tuning control (NSTC) algorithm was developed and applied to control the 
tip of a flexible arm system. The dynamics of the flexible arm tip involves an unstable zero and 
therefore making the system non-minimum phase, most of the existing neural adaptive control are 
based on the inverse dynamics and therefore would not be able to control this type of plant. The 
NSTC was based on an indirect control method where the identification is performed by the neural 
network and the control was based on the generalized minimum variance (GMV) control law. The 
performance of the NSTC was investigated and was compared to the adaptive STR by means of 
simulation. 

In summary, the NSTC has a very comparable performance to the adaptive STR shown by 
simulation results in section 4.2. Unlike other applications of neural networks where thousands of 
iterations were required before the network can be maturely trained, in this application the neural 
network identification had a convergent rate comparable to that of the RLS. Another advantage of 
the NSTC is due to the availability of neural network VLSI and the massive parallel architecture of 
the neural network there will be a computation advantage over conventional recursive algorithms. 
This will enable real-time implementation with faster sampling rate for system with high 
bandwidth. Also another advantage of the NSTC is that because the identification is done by the 
neural network it inherits the decentralize property, meaning if there is a failure in a node or 
connection the impact on the performance will be minimal. Whereas with the conventional digital 
filter a failure in one of the coefficient will have a major impact on the output. With all the above 
encouraging characteristics there is one disadvantage of using the neural network and that the the 
lack of understanding how the dimension and activation characteristics of a network is related to its 
accuracy and stability. Whereas these issues of the recursive algorithms have been addressed and 
elaborately analysed (Kumar, 1990). 


6. Future research 

The NSTC can be modified and extended to control systems that are not only non- 
minimum phase but also nonlinear. This is so that the properties of neural networks can be fully 
exploited. A system that have the above characteristics is a two degree of freedom robotic 
manipulator with the second link being flexible. Most conventional adaptive control schemes rely 
heavily on the inverse dynamics and therefore showed great limitations with this type of system 
(Centinkunt and Yu, 1990). 
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APPENDIX 

Simulation Program 

The simulation was perforemed using the software MATLAB. The program shown below is the 
NSTC scheme. 


etc 

clear 

% 

% ■ BEGIN SIMULATION ' " 

N3=1500; % NUMBER OF ITERATIONS 

ndisp=30; 

ALGsr = 1; % 1==> Gradient 2==> Newton 3==>MV 

ALGri =1; % 1=> Gradient 2==> Newton 3==>MV 

ALGij = 1; % 1— > Gradient 2=> Newton 3==>MV 

U>sl ; % 1=> RLS 2=> Neural I.D. 0==>Determistic 

% 

% 

% — INITIALIZATION 

% PLEASE S ELE CT THE DIMENSION OF THE STATE VECTOR X0, 

% INPUT VECTOR U0, OUTPUT VECTOR Y0, AND PARAMETER VECTOR 
% P0 BY MODIFYING THE FOLLOWING STATEMENTS: 

% 

% P0=4; 

POO=l; PS 10=1; 

% " ^ 

% 

% Plant [al a2 a3...ana bO bl b2...bnb]; 

% A = [.7 .5 -.3]'; B=[l .2 -.1 JJt 

% A = [.7 J]t B=[l .2 -.1]*; 

% THETAp = [.7 .5 1 .2 -.1]'; % minimum phase 2nd order plant 

% A = [.7 .5 -3]'; B = [1 .2 -.1 3 It 

% THETAp = [.7 .5 -.3 1 .2 -.1 .3]'; % minimum phase 3rd order 
% THETAp = [.7 .5 -.3 1 .2 -.1 3]'; % non-minimum phase 3rd ; ld=4-10 
% A=[.7 .5 -J]t B=[l 2 -.1 31*; 

% THETAp = [-2.58 2.18 -.5965 -429.7 884.8 -430.8]’; %missile nmp 
% THETAp = [-3.987 5.96 -3.96 .987 -6.94e-5 6.92e-5 6.9e-5 -6.88e-5]’; 

% Mxl 

%THETAp = [-3.87 5.63 -3.64 .882 -.0068 .0066 .0065 -.0063]'; %missile 
%THETAp = [-2.979 2.96 -.979 -.0047 .0094 -.0047]'; %Submarine 
load plant 

num=numd’/dend(l); 

den=dencf/dend(l); 

B=num; 

A=den(2;length(den)); 

% B=numd A=dend(2^) 


THETAp = [A' BT; 


na=length(A);nb=Iength(B)- 1 ;d= 1 ; 

nf=d-l; 

ne=nf+nb; 

ng=na-l; 

nc=0; %This assumes the noise has no dynamics, i.e. C=l; 
PO = (ng+l)+(ne+l)+nc; %Dimension of THETA 

P0p=na+nb+l; %Dimension of plant's THETA 

THETAEST= 1 *ones(P0, 1 ); 

%load thetaest 
THETAO=THETAEST; 
yest=0; w=0; 

P=P00*eye(P0J > 0); 

PSIp=PSIO*ones(POp,l); 

PSId = zeros(PO,l); 

K=1.5*ones(P0,l); 

Yl=zeros(ng+l,l); Ul=zeros(ne+l,l); 

Yc=zeros(ng+l,l); Uc=zeros(ne,l); 

Ylp=zeros(na,l); Ulp=zeros(nb+l,l); 

Ud=zeios(d,l); Yd=zeros(d,l); % delayed values of u, y and w 
Wd=zeros(cL,l); 

y=0; u=0; % output y(k), input u(k) 

VARV = 0; % Output noise variance 

MEANV=0; 


n0=2; nl = 5; n2=15; n3=P0; % Dimensions of Neural Network 

NETr = zeros(nl,l); 

NETi = zeros(n2,l); 

NETj = zeros(n3,l); 

Is = zeros(nO.l); 

Or = zeros(nl.l); 

Oi = zeros(n2,l); 

Oj = zeros(n3,l); 

ALPHAj = .O3*ones(n3,l);, ALPHAi = ,03*ones(n2,l); 

ALPHAr = .03*ones(nl,l); 

Hj = 0*ones(n3,l);, Hi = 0*ones(n2,l);, Hr = 0*ones(nl,l); 

Kj = 3*ones(n3,l);, Ki = 2*ones(n2,l);, Kr = 2*ones(nl,l); 

Wsr = rand(nOail); 

Wri = rand(nl4i2); 

Wij = rand(n2ji3); 

mu=.8; 
lambda0=.99; 
lambdak = .995; 

LAMBDA = 1; 

Pij = 10*ones(n2ai3); 

Pri = 5*ones(nl,n2); 

Psr= 5.3*ones(n0jil); 

Rn = .001; 

Re=.9; 

% 

% = END OF INITIALIZATION = 
randfseed'.lO); 


% 

% BEGIN ITERATION 
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for k=l:N3 

% ===== STOCHASTIC ARMA REPRESENTATION OF A LINEAR PLANT = 
% 

% y(k)+aly(k- l)+...+anay(k-na)=bOu(k-d)+b 1 u(k- 1 -d)+. ..+bnbu(k-nb-d)+ v(k) 

% y(k)=PSIp'(k)*THETAp(k)+v(k) 

% PSIp'=[-y(k-l)...-y(k-na) u(k-d)...u(k-nb-d)] 

% THETAp(k)'=[al a2..ana bO bl b2...bnb] 

% THETA'(k) = [gO gl ... gng eOel...ene] 

% PSId'(k) = [y(k-d).. y(k-d-ng) u(k-d)..u(k-d-ne)] 

% yest(k) = PSId(k)'*THETAEST 

% Confuting THETA 

if <4=1 
E=B* 

G=-A; %g(i)=-a(i+l), i=0..ng G=q-l(l-A) 

end 

THETA = [GET; 


% PARAMETRIZATION FOR PSI(k). 
% 

% 

% 


% u=u(k-l) y=y(k-l) 

for i=d-l:-l:l Ud(i+l)=Ud(i);, end, Ud(l)=u; %[u(k-l)...u(k-d)] 
for i=d-l:-l:l Yd(i+l)=Yd(i);, end, Yd(l)=y; %[y(k-l)...y(k-d)l 
f«w i=d-l:-l:l Wd(i+l)=Wd(i);, end, Wd(l)=w; %[w(k-l)...w(k-d)] 

% PSIp(k) = [-y(k-l).. -y(k-na) u(k-d)..u(k-d-nb)]’ 
for i=na-l:-l:l Ylp(i+l)=Ylp(i);, end; Ylp(l)=-Yd(l); 
for i=nb:-l: 1 Ulp(i+l)=Ulp(i);, end; Ulp(l)=Ud(d); 

PSIp = [Yip' UlpT; 

% PSId(k) = [y(k-d).. y(k-d-ng) u(k-d)..u(k-d-ne)] 
for i=ng:-l:l Yl(i+l)=Yl(i);, end; Yl(l)=Yd(d); %[y(k-d)„ y(k-d-ng)] 
for i=ne:-l:l Ul(i+l)=Ul(i);, end; Ul(l)=Ud(d); %[u(k-d)..u(k-d-ne)] 
PSId = [Yl* U1T; % PSI(k-d) 


% ■■ — 

% GENERATING NOISE v(k) — 

randfnormal') 

v=sqrt(V AR V)*rand(l ,1 KMEANV; 

% 

% COMPUTING y(k) & w(k) 

% 


y=PSIp'*THETAp+v; %y(k) 
tau=.5; 

w=tau*w + (l-tau)*2; %*sign(sin(0.004*(k))); %command signal w(k) 
w=(w/2)+l; 

% 

% END OF PLANT 
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% . 

% adaptive estimation 

% -- the STOCHASTIC LEAST SQUARES ALGORITHM (SLA) === 

% 

fjfo , ■ — — — — 1 

% BEGIN ESTIMATION 

% THETAEST = [gO gl ... gng eO el ... ene]' 

yest=PSId'*THETAEST; % PREDICTED OUTPUT yest(k) 

e=y-yest; % PREDICTION ERROR e(k) 

ifID=l 

K=P*PSId*inv(l+(PSId'*P*PSId)); % OPTIMAL GAIN 

THETAEST=THETAEST+K*e; % PARAMETER ESTIMATION 

P=(P-K*PSId'*P); 
end 

% Neural identification — 

ifID=2 

Is(l)=l; 


Netr = Wsr'*Is' 

tempr= (ALPHAr/2).*(Netr+Hr); 
Or = Kr.*tanh(tempr); 

Or(l)=l; 


Neti = Wri'*Or, 

tempi = (ALPHAi/2).*(Neti+Hi); 

Oi = Ki.*tanh(tempi); 

Oi(l)=l; 

Netj = Wij'*Oi; 

tempj = ( ALPHA j/2).*(Netj+Hj); 

Oj = Kj.*tanh(tempj); 

THETAEST = Oj; 

if k=l save thetaest THETAEST, end 
PSI=PSId; 

tempj2 = cosh(tempj).*cosh(tempj); 
tempi2 = cosh(tempi).*cosh(tempi); 
tempr2 = cosh(tempr).*cosh(tempr); 
Fdotj = (Kj.* ALPHA j/2)V(tempj2); 
Fdoti = (Ki.* ALPHA i/2)V(tempi2); 
Fdotr= (Kr.*ALPHAr/2)./(tempr2); 

dj = Fdotj.*PSI; 

PSIij = Oi*dj'; 
di = (Wij*dj).*Fdoti; 

PSIri = Or*di'; 

Q = Fdoti • (Wij*(Fdotj.*PSI)); 
dr = Fdotr .* (Wri*Q); 

PSIsr = Is*dr’; 

if ALGij = 1 % Gradient 

Lij = mu*PSIij/LAMBDA; 
end 


sP «P sP sP 


if ALGij = 2 % Newton 

Sij = (PSIij.*PSIij.*Pij) + (lambdak*LAMBDA*ones(n2,n3)); 
Lij = (Pij.*PSIij)./Sij; 

Pij = (Pij - (Lij.*Sij.*Lij))/lambdak; 
end 

if ALGij = 3 % Minimum Variance 

Sij = (PSIij.*PSIij.*Pij) + Re*ones(n2,n3); 

Lij = (Pij.*PSIij)VSij; 

Pij = Pij - (Lij.*PSIij.*Pij) + Rn*ones(n2,n3); 
end 

if ALGri =1 % Gradient 

Lri = mu*PSIri/LAMBDA; 
end 

if ALGri — 2 % Newton 

Sri = (PSIri *PSIri.*Pri) + (lambdak*LAMBDA*ones(n 1 ,n2)); 
Lri = (Pri *PSIri)ySri; 

Pri = (Pri - (Lri.*Sri.*Lri))/lambdak; 
end 

if ALGri = 3 % Minimum Variance 

Sri = (PSIri.*PSIri.*Pri) + Re*ones(nl,n2); 

Lri = (Pri *PSIri)VSri; 

Pri = Pri - (Lri.*PSIri.*Pri) + Rn*ones(nl ji2); 
end 

if ALGsr — 1 % Gradient 

Lsr = mu*PSIsr/LAMBDA; 
end 

if ALGsr — 2 % Newton 

Ssr = (PSIsr.*PSIsr.*Psr) + (lambdak*LAMBDA*ones(nO,nl)); 
Lsr = (Psr.*PSIsr)./Ssr, 

Psr = (Psr - (Lsr *Ssr.*Lsr))/lambdak; 
end 

if ALGsr = 3 % Minimum Vasrance 

Ssr » (PSIsr.*PSIsr.*Psr) + Re*ones(nO,nl); 

Lsr = (Psr.*PSIsr)./Ssr; 

Psr = Psr - (Lsr.*PSIsr.*Psr) + Rn*ones(nO,nl); 
end 

Wij = Wij + Lij*e; 

Wri = Wri + Lri*e; 

Wsr = Wsr + Lsr*e; 

%LAMBDA = LAMBDA + (e*e'-LAMBDA)/k; 
lambdak = Iambda0*lambdak+(l-iambda0); 

Re = Re + (e*e'-Re)/k; 


for i=n0:-l:2 Is(i)=Is(i-l);, end 
end 

END OF ESTIMATION 

= MINIMUM VARIANCE ADAPTIVE CONTROL 
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% 

% 

% 

% BEGIN ADAPTIVE CONTROL 


for i=ng:-l:l Yc(i+l)=Yc(i);, end; Yc(l)=y; %[y(k).. y(k-ng)] 
for i=ne-l:-l:l Uc(i+l)=Uc(i);, end; Uc(l)=u; %[u(k-l)..u(k-ne)] 


Id — .1; 
ld2= .06; 
ld3=0; 

if ID=0 THETAEST=THETA;, end % Q = Id + q-lld2 

if k<l THETAc=THETA;, else THETAc=THETAEST;, end 
Gest(l:ng+l,l) = THETAc(l:ng+I); % G 
Eqest(l:ne+l,l) = THETAc(ng+2:ng+2+ne); % E 
Eqest(l) = Eqest(l)+Id; % E+Q 

Eqest(2) = Eqest(2)+kl2; 

%Eqest(3) = Eqest(2HW3; 

% u0^)= {w(k)-[gOy(k)+...+gncy(k-ng)] -[elu(k-l)+...+eneu(k-ne)]j/eO 

SUM1 = Eqest(2:ne+l) , *Uc; 

SUM2 = Gest'*Yc; 

%roots(Eqest) 

%brcak 

u=(w-SUM2-SUMl)/Eqest(l); %u(k) 

%u=w; 

% END OF ADAPTIVE CONTROL 

% ' 

% 

% SIMULATION ERRORS 

% SAVE THETA(k) & THETAEST(k) 

fra - j=l;P0 

THETA l(kj)=THETA(j); 

THETAlEST(kj)=THETAEST(j); 

end 

% SAVE y(k) & yest(k) 

Y(k,l)=y; 

YEST(k,l)=yest; 

% SAVE K(k) 

for j=I:P0 
Kl(kj>K(j); 
end 

% SAVE U(k) 

U(k)=u; 

W(k)=Wd(d); 

% 

% 

% THE PARAMETER IDENTIFICATION MSE(k) 

THET AER=THET A 1 -THET A 1 ES T; 
for j=l:P0 

if k=l, TMSE(kj)=THETAER(kj) A 2; else 
TMSE<kj)=TMSE(k- 1 j>+{THETAER(kj) A 2-TMSE(k- 1 j))/k; 
end 
end 



% THE OUTPUT PREDICTION MSE(k) 

YER(k)=y-yest; 

if k=l, YMSE(k)=YER(k) A 2; else 
YMSE(k)=YMSE(k- l)+(YER(k) A 2- YMSE(k- 1 ))/k; 
end 

% THE COST FUNCTION^) J(k) 

YERc(k)=y-Wd(d); 
if k=l, J(k)=YERc(k) A 2; else 
J(k)=J(k- l)+(YERc(k) A 2-J(k-l))/k; 
end 

% == DISPLAY MATRIX 

% THIS M-FILE IS USED TO MONITOR SYSTEM PERFORMANCES DURING 
% SIMULATION. 

% — 

% TRANSFER DATA TO MATRIX DISMAT1 — 

DISMATl(l,l)=k; 

DISMATl(U)=TMSE(k.l); 

DISMATI ( 1 3)=TMSE(k,2); 

DISMAT1 ( 1 ,4)=TMSE(k,3); 

% DISMATl(U)=TMSE(k.4); 

DISMATl(l,6)=YMSE(k); 

DISMAT 1(1 ,7)=U(k); 

% TRANSFER DATA TO MATRIX DISMAT2 

DISMAT2(l,l)=k; 

DISMAT2(U)=J(k); 

DISMAT2(13)=Wd(d); 

DISMAT2( 1 ,4)=y ; 

DISMAT2( 1 ,5)=yest; 

% DISPLAY DISMATI & DISMAT2 

if iem(Mdisp)=0 
home 

dispC k TMSE1 TMSE2 TMSE3 TMSE4 YMSE U(k)') 

disp(DISMATl) 

dispf k J(k) w(k-d) y(k) yest 1 ) 

disp(DISMAT2) 

% [THETA THETAEST] 
end 

% 

% = END OF DISMAT 
%keyboanl 

end % END OF FOR LOOP(k) 

% ::::: END OF ITERATION ::::: 

% 

% SYSTEMS GRAPHICS 

SYGRAF 

% 


% 


END OF SIMULATION 


